(54) Speech recognition

(57) Words uttered by a user are compared 2 with stored words 3; the word giving the best "score" in
the comparison is deemed to have been recognised. Where equal or similar scores occur the result is
ambiguous, and in that case a message is generated (eg by means of a speech synthesiser 7) containing a
word for the user to confirm. If the user does not confirm it, a second word may similarly be offered.
SPECIFICATION
Speech recognition 

The present invention relates to speech recognition systems. In such systems words uttered are
subjected to known pattern recognition techniques and, if correspondence with a known word is
found, suitable coded signals are generated identifying the word. Correspondence is generally
determined by generating signals or "scores" indicating the degree of similarity with stored
patterns corresponding to known words; the word having the best score is deemed to be the
word uttered. This technique fails, however, if an ambiguous result is obtained (ie if two scores
are obtained which are the same or differ only by a small amount). Normally in an interactive
arrangement the remedy is for the recognition system to respond by presenting the user with a
request to repeat the word in question.

However, this approach suffers from the disadvantage that there is a high probability of the
ambiguity recurring; also it can be irksome for the user.

According to the present invention, therefore, there is provided a speech recognition apparatus
comprising analysis means for receiving speech signals from a user, comparing each received
word with stored representations of words to produce similarity signals indicating the degree of
correspondence between them, and producing coded signals identifying recognised words, and
output means for presenting messages to the user, the analysis means being operable in the
event that the similarity signal in respect of a first stored representation to which a received
word most closely corresponds is equal to, or differs by less than a predetermined margin from,
the similarity signal in respect of a second stored representation to

(a) generate via the output means a message including the word represented by the first
stored representation;

(b) await an indication from the user as to whether the word is correct;

(c) upon receipt of a positive indication to generate the said coded signal.
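
The ambiguity condition stated above can be illustrated with a minimal sketch. It is not taken from the specification: the function names, the use of a dictionary of scores, and the convention that a higher score is a better score are all assumptions made for illustration only.

```python
from typing import Dict, List


def two_best(scores: Dict[str, float]) -> List[str]:
    """Return up to two words, best score first.

    Assumption: a higher score means a closer match. The specification
    speaks only of the "best" score.
    """
    return sorted(scores, key=scores.get, reverse=True)[:2]


def is_ambiguous(scores: Dict[str, float], margin: float) -> bool:
    """True when the two best scores are equal or differ by less than margin."""
    best = two_best(scores)
    if len(best) < 2:
        return False  # nothing to confuse the word with
    first, second = best
    gap = scores[first] - scores[second]  # never negative after sorting
    return gap == 0 or gap < margin


# Hypothetical scores for a mispronounced "transfer".
scores = {"statement": 71, "transfer": 69, "balance": 30}
print(two_best(scores), is_ambiguous(scores, margin=5))
```

Here `margin` stands for the predetermined margin; in a real system its value would presumably be chosen to suit the scoring scale of the recogniser in use.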

In the event that a negative indication is received from the user, the analysis means may
generate a message requesting repetition of the word, or may

(i) generate via the output means a message including the word represented by the second
stored representation and

(ii) await an indication from the user as to whether the word is correct.

The output means may be a visual display, or could be a speech synthesiser.

The indication from the user may be input by means of switches or a keypad, but more
preferably is by speaking appropriate words (eg "Yes" or "No") which may then be analysed by
the analysis means.

One embodiment of the invention will now be described by way of example, with reference to 
the accompanying drawing which is a block diagram of a speech recognition apparatus. 
In the figure, speech from a user is received by a microphone 1 connected to a speech
recogniser 2. The recogniser compares received words with the contents of a pattern store 3
which contains representations of a repertoire of words which it is desired to recognise.

Any of a number of conventional recognition algorithms may be used and these will not
therefore be discussed in detail. By way of example the "VOTAN" recogniser card produced by
Votan Inc. for use with an IBM PC microcomputer might be employed.

The recogniser 2 compares a received word with each of the stored representations and
produces for each a similarity signal or "score" which indicates the closeness of fit between the
two. Normally the word whose stored representation has the best score is the one "recognised"
and a corresponding coded signal is passed via line 4 to a control unit 5, which might
for example be the aforementioned IBM computer, for onward transmission or to initiate further
action, according to the purpose of the system.

If, however, two stored representations have the same or similar scores, the result is
ambiguous and a signal indicating this is passed via line 6, along with codes for both words via line
4, to the control unit 5 which responds by generating a message back to the user via a speech
synthesiser 7: "Did you say X", where X is the word whose representation
stored in the pattern store gave rise to the better score (or, if the two scores are equal,
one of the two selected at random); and awaits a reply. The synthesiser is assumed to have a
parameter store 9 to enable it to generate appropriate words.

If the user replies "Yes" (or "No") this is recognised by the recogniser 2 and signalled to the
control unit 5 which, in the event of a "Yes", proceeds as if X had been identified. In the
event of a "No", a further message is issued via the synthesiser, viz "Did you say Y". Again
the user response is analysed and if Y is confirmed, recognition is deemed complete; if the user
again replies "No", the control unit then initiates generation of a request for repetition (although
in principle, of course, the third choice could be offered).
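
The confirmation sequence just described (offer X, then Y, then fall back to a request for repetition) might be sketched as follows. This is an illustrative sketch only: `resolve_ambiguity` and the `ask` callback are hypothetical names, `ask` stands in for the synthesiser prompt plus recognition of the spoken "Yes" or "No", and a higher score is assumed to be the better score.

```python
import random
from typing import Callable, Dict, Optional


def resolve_ambiguity(
    scores: Dict[str, float],
    ask: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Offer the two best candidates in turn and return the confirmed word.

    Returns None when both are rejected, meaning the caller should issue a
    request for repetition. Assumes `scores` holds at least two words and
    that a higher score is better.
    """
    rng = rng or random.Random()
    ordered = sorted(scores, key=scores.get, reverse=True)
    x, y = ordered[0], ordered[1]
    # With exactly equal scores, choose at random which word is offered first.
    if scores[x] == scores[y] and rng.random() < 0.5:
        x, y = y, x
    for candidate in (x, y):
        if ask(f"Did you say {candidate}?"):  # True stands for a spoken "Yes"
            return candidate
    return None


# Simulated user who wanted "transfer" and rejects anything else.
word = resolve_ambiguity(
    {"statement": 71, "transfer": 69, "balance": 30},
    ask=lambda message: "transfer" in message,
)
print(word)
```

Returning `None` rather than offering a third candidate mirrors the embodiment described; offering the third choice would amount to extending the loop over more of `ordered`.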

By way of example, one possible use for such a system might be in a telephone banking
service. Here the control unit would be programmed to generate questions to the user, via the
speech synthesiser, and to respond by generating further questions to elicit the required
information to assemble an instruction which may then be passed to a bank's staff or computer, for
effecting a credit transfer, printing a statement, or the like.
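
As a rough illustration of how a control unit might assemble such an instruction from successive answers, consider the sketch below. The field names, the question list, and the `ask_word` callback (standing in for one prompt followed by one recognised reply) are hypothetical and are not part of the specification.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical question list for a transfer service: (field name, prompt).
TRANSFER_QUESTIONS: List[Tuple[str, str]] = [
    ("from_account", "From which account do you wish to transfer funds?"),
    ("to_account", "Which account do you wish to transfer funds to?"),
    ("amount", "How much money (in pounds) do you wish to transfer?"),
]


def assemble_instruction(
    service: str,
    questions: List[Tuple[str, str]],
    ask_word: Callable[[str], str],
) -> Dict[str, str]:
    """Ask each question in turn and collect the recognised replies."""
    instruction = {"service": service}
    for field, prompt in questions:
        # ask_word is assumed to return a word only once recognition,
        # including any confirmation step, is complete.
        instruction[field] = ask_word(prompt)
    return instruction


# Simulated replies in place of a real recogniser.
replies = iter(["savings", "current", "ten"])
print(assemble_instruction("transfer", TRANSFER_QUESTIONS, lambda _: next(replies)))
```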

A typical set of words, representations of which might be included in the pattern store 3, might
be:

Services:
"Statement"        -Order full statement
"Balance"          -Give balance
"Mini-statement"   -Last 4 transactions
"Transfer"         -Transfer money between accounts
"Cheque-book"      -Order new cheque-book
"Help"             -Request assistance from bank staff

Account Types:
"Current"   )
"Savings"   ) Bank accounts
"One"       )
"Two"       ) Credit card accounts

Amounts:
"Ten"       )
"Twenty"    )
"Thirty"    ) Amount, in pounds
"Forty"     )
"Fifty"     )
"Full"             -Make full payment

Cancel:
"Stop"             -Cancel service (may be used during speech output)

A typical user-machine dialogue might proceed as follows (after an entry procedure with
appropriate identity numbers and/or passwords, possibly accompanied by speaker recognition
techniques for added security):

(1) System: "Which service do you require?"
(2) User: "Transfer" (mispronounced)
(3) System: "Did you say 'statement'?"
(4) User: "No"
(5) System: "Did you say 'transfer'?"
(6) User: "Yes"
(7) System: "From which account do you wish to transfer funds?"
(8) User: "Savings"
(9) System: "Which account do you wish to transfer funds to?"
(10) User: "Current" (mispronounced)
(11) System: "Did you say 'Current'?"
(12) User: "Yes"
(13) System: "How much money (in pounds) do you wish to transfer from
     your savings account to your current account?"
(14) User: "Ten" (mispronounced)
(15) System: "Did you say 'twenty'?"
(16) User: "No"
(17) System: "Did you say 'thirty'?"
(18) User: "No"
(19) System: "How much money (in pounds) do you wish to transfer from
     your savings account to your current account?"
(20) User: "Ten"
(21) System: "Ten pounds will be transferred from your savings account
     to your current account. Do you require another service?"

Note the statements at lines 2, 10 and 14 where poor pronunciation, noise or the like has
given rise to an ambiguity which has been resolved, in two cases, by offering to the user the
words judged to be closest to the speech actually received.
1. A speech recognition apparatus comprising analysis means for receiving speech signals
from a user, comparing each received word with stored representations of words to produce
similarity signals indicating the degree of correspondence between them, and producing coded
signals identifying recognised words, and output means for presenting messages to the user, the
analysis means being operable in the event that the similarity signal in respect of a first stored
representation to which a received word most closely corresponds is equal to, or differs by less
than a predetermined margin from, the similarity signal in respect of a second stored
representation to

(a) generate via the output means a message including the word represented by the first
stored representation;

(b) await an indication from the user as to whether the word is correct;

(c) upon receipt of a positive indication to produce the said coded signal.

2. An apparatus according to claim 1 in which the analysis means is arranged, upon receipt
of a negative indication to (i) generate via the output means a message including the word
represented by the second stored representation and (ii) await an indication from the user as to
whether the word is correct.

3. An apparatus according to claim 2 in which the output means is a speech synthesiser.

4. A speech recognition apparatus substantially as herein described with reference to the
accompanying drawing.